PATENT APPLICATION
IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

In re application of 

Hans-Jurgen MATT, et al. Attorney Docket Q61703 

Appln.No.: Not yet assigned Group Art Unit: Not yet assigned 

Filed: November 21, 2000 Examiner: Not yet assigned 

For: EXPONENTIAL ECHO AND NOISE REDUCTION IN SILENCE INTERVALS 

PRELIMINARY AMENDMENT 

Assistant Commissioner for Patents 
Washington, D.C. 20231 

Sir: 

Prior to examination, please amend the above-identified application as follows: 

IN THE SPECIFICATION:

Page 1, after the title insert the heading --Background of the Invention--.

Page 5, after line 9 (not including paragraph spacing) insert the heading --Summary of
the Invention--.

Page 6, after line 1, insert: 

--Brief Description of the Drawings

The invention will be more clearly understood from the following detailed description in
conjunction with the accompanying drawings, wherein:

Fig. 1 shows the control signal a_0 in the presence of speech signals, during a silence
interval, and when the speech signal resumes;

Fig. 2 shows a scheme of an arrangement for controlled signal attenuation;

Fig. 3a shows the function g(S/N) in linear approximation;

Fig. 3b shows the corresponding function g'(N/S);

Fig. 4a shows the function g(S/N) as a skewed bell curve; and

Fig. 4b shows the corresponding function g'(N/S).

Detailed Description of the Invention--
Page 15, delete lines 18-22 in their entirety. 
Page 16, delete lines 1-4 in their entirety. 

IN THE CLAIMS:

Claim 4, line 1, delete "any one of the preceding claims" and insert --claim 1--.

Claim 5, line 1, delete "any one of the preceding claims" and insert --claim 1--.

Claim 6, line 1, delete "any one of claims 1 to 4" and insert --claim 1--.

Claim 9, line 1, delete "any one of claims 6 to 8" and insert --claim 6--.

Claim 10, line 1, delete "any one of claims 6 to 8" and insert --claim 6--.

Claim 11, line 1, delete "any one of claims 6 to 10" and insert --claim 6--.

Claim 12, line 1, delete "any one of the preceding claims" and insert --claim 1--.

13. (Amended) A method as claimed in claim 12 [and in any one of claims 6 to 11],
characterized in that during a silence interval and/or in the presence of an echo signal and for
a_0(k) < c_2, where c_2 is a predefined constant, the power value of the noise level N in the
communications channel currently being used is continuously measured and/or estimated, and
that depending on the current noise level N, the control signal a_0(k+1) is continuously adjusted
according to a_0(k+1) = f(N), where f(N) is a predetermined function of N, said method further
characterized in that the control signal a_0(k+1) is continuously adjusted according to a_0(k+1) =
h(N, S, ES, τ_E, ERL), where h(N, S, ES, τ_E, ERL) is a predetermined function of the noise level
N, the signal level S, the useful signal ES in the opposite direction from a speaking party, the
constant delay τ_E of the echo signal, and an attenuation constant ERL of the amplitude of the
echo signal.



Claim 15, line 1, delete "any one of claims 12 to 14" and insert --claim 12--.

Claim 18, line 1, delete "any one of the preceding claims" and insert --claim 1--.

Claim 20, line 1, delete "any one of the preceding claims" and insert --claim 1--.

Claim 21, line 1, delete "any one of the preceding claims" and insert --claim 1--.

Claim 22, line 1, delete "any one of claims 1 to 21" and insert --claim 1--.

Claim 23, line 1, delete "any one of claims 1 to 21" and insert --claim 1--.



IN THE ABSTRACT:

Change the heading from "Summary" to --ABSTRACT--.

After the abstract, delete "(Fig. 1)".

REMARKS 

Entry and consideration of this Amendment is respectfully requested. 



SUGHRUE, MION, ZINN, 

MACPEAK & SEAS, PLLC 
2100 Pennsylvania Avenue, N.W. 
Washington, D.C. 20037-3213 
Telephone: (202) 293-7060 
Facsimile: (202) 293-7860 

Date: November 21, 2000

Respectfully submitted,

David J. Cushing
Registration No. 28,703

Exponential echo and noise reduction in silence intervals



A method of reducing echo and/or noise signals in telecommunications 
systems for transmitting useful acoustic signals, particularly human speech, 
comprising determining by silence detection when the mixture of useful 
signals and interference signals contains a speech signal or when a silence 
interval is present, and varying, by means of a two-input multiplier, the 
amplitude of the useful signals, which are generally disturbed by echo 
and/or noise signals, in response to a time-dependent control signal a_0(t) or
a control signal a_0(k) clocked at a sampling rate f_T = 1/T, where k ∈ ℕ
denotes the number of samples, and T denotes the period from one sample
to the next.

Such a method is known, for example, from DE 42 29 912 A1.

During natural communication between people, as a rule the amplitude of 
the spoken word is automatically adapted to the acoustic environment. 
However, in remote spoken communication the speaking partners are not in
the same acoustic environment, so neither is aware of the acoustical
situation at the location of the other. The problem occurs particularly
acutely when one of the partners is compelled by his acoustic surroundings
to speak very loudly, while the other partner is in a quiet acoustic
environment and is producing speech signals of lower amplitude.

A further problem is that on a telecommunications (TK) channel some noise of
"electronic origin" is produced and this is co-transmitted as a background to
the useful signal.
Furthermore, it is also advantageous to attenuate or completely suppress 
distorting signals such as undesired background noise (noise from the 
street, the factory, the office, the canteen, aircraft noise, etc.). To enhance 
comfort while telephoning, it is generally attempted to keep every type of 
noise as low as possible. 

Finally, in TK communications there also occur so-called echoes, which are 
present in two-wire TK networks as line echoes and can for example appear 
in simple and less comfortable TK terminals in the form of acoustical 
echoes. 

In general therefore, in the transmission of a mixture of speech signals and 
distorting signals, it is important to reduce the amplitude of distorting 
signals such as noise and echoes as much as possible. 

A known method for noise reduction is the so-called "spectral subtraction", 
as described for example in the publication "A new approach to noise 
reduction based on auditory masking effects" by S. Gustafsson and P. Jax, 
ITG Technical Conference, Dresden, 1998. This involves a spectral noise- 
reduction method in which an acoustic masking threshold (for example 
according to the MPEG Standard) is taken into account. The disadvantages 
of such methods are that determination of the said acoustic masking 
threshold is an elaborate process and that carrying out all the operations 
associated with the method entails considerable computational effort.

In spectral subtraction the noise in speech pauses is first measured and
stored continuously in a memory in the form of a power density spectrum.
The power density spectrum is obtained via a Fourier transformation.
When speech occurs, the stored noise spectrum is subtracted as a "best
current estimated value" from the actual distorted speech spectrum, and the
result is transformed back into the time domain, so that in this way a noise
reduction for the distorted signal is obtained.

A further disadvantage of spectral subtraction is that by virtue of the process 
of noise estimation and subsequent subtraction which are inexact in 
principle, defects occur in the output signal which are noticeable as 
"musical tones". In addition, this known method is hardly appropriate for 
the suppression of echo signals in TK communication links. 

In the extended spectral signal processing also described in the reference 
cited above, with the help of spectral subtraction the power density spectra 
for the noise and for the speech itself are first estimated. From a 
knowledge of these part-spectra, with the help for example of the rules of
the MPEG Standard, a spectral acoustic masking threshold R_T(f) for the
human ear is then calculated. With the help of this masking threshold and
the estimated spectra for noise and speech, a simple rule is then applied to 
compute a filter pass curve H(f) which is designed such that essential 
spectral portions of the speech are let through as unchanged as possible, 
while spectral portions of the noise are attenuated as much as possible. 

The original distorted speech signal then need only be passed through this 
filter to obtain a noise reduction for the distorted signal. The advantage of 
the method is now that "nothing is added to or subtracted from" the 
distorted signal, so estimation errors have little perceptible effect or hardly 
any at all. The disadvantages are again the considerable computational
effort for spectral noise suppression and the need for upstream connection
of an adaptive filter for echo suppression.

In the known compander method, as described for example in the patent
DE 42 29 912 A1 cited earlier, the degree of noise and echo attenuation is
established in accordance with a fixed predetermined transfer function 
which, among other things, effects a level reduction even in the case of very 
small input signals. 

The compander first has the property of transmitting speech signals with a 
given (previously set) "normal speech signal level" (sometimes called the 
normal loudness) virtually unchanged from its input to the output. 

If, now, the input signal is ever too loud, for example because a speaker 
comes too close to his microphone, a dynamic compressor limits the output 
level to almost the same value as in the normal case, in that the actual 
amplification in the compander is linearly reduced as the input signal 
becomes louder. Thanks to this property, the speech at the output of the 
compander system remains at approximately equal loudness regardless of 
how marked is the fluctuation of the input loudness. 

On the other hand, if a signal with a level lower than normal is fed to the 
input of the compander, the signal is additionally damped in that the 
amplification is cut back so as to transmit background noise only in 
attenuated form so far as possible. 

Thus, the compander consists of a compressor for speech signal levels
higher than or equal to a normal level, and an expander for signal levels
lower than the normal level. In the expander, the reduction in amplification
is more marked the lower the input level is.

A disadvantage of the compander solution is the considerable 
computational effort required to carry out the known process. Besides, the 
compression of the speech signal level on the one hand and its expansion 
on the other hand give rise to a modulation in the loudness of the speech, 
which changes the speech signal in such a way that the result is often 
perceived subjectively as unsatisfactory, i.e. it creates an unsatisfactory 
auditory impression. 

The purpose of the present invention, in contrast, is to propose a method
having the characteristics described at the start, by means of which, in the
least elaborate and most cost-effective way possible, without major
computational effort and with a reduced need for computer memory and data
storage space, echo and noise attenuation is achieved by using simple
means to produce an overall acoustic impression as pleasant as possible
for the human ear, which can in addition be adapted to individual needs
according to taste.

According to the invention this objective is achieved in a manner as simple
as it is effective, by varying the control signal a_0(t) or a_0(k) in such a way that
during the presence of speech signals in the useful signal the amplitude of
the control signal a_0(t) or a_0(k) is set to a predetermined constant
amplification value c_0 and when a silence interval begins in the useful
signal the amplitude of the control signal a_0(t) or a_0(k) is continually reduced
from one sample value to the next in accordance with the recurrence
formula:

a_0(k+1) = a_0(k) · β, where β < 1,

and after the end of a silence interval a_0(k) is again restored to c_0.

This provides a very simple and cost-effective method, which also achieves 
surprisingly good quality in relation to the reduction of distortion since it 
preferably attenuates the distorting echo and noise signals during silence 
intervals. During the speaking phases themselves, the distorting noise is at 
least partially masked and therefore obviously perceived by the human ear 
to a far smaller extent. By doing without compression according to the 
known compander method, the original speech signal is considerably less 
changed so that, as a result, a speech signal which as a rule sounds better 
at the other end of the line is obtained. In addition, the method according 
to the invention requires less computing power than the compander 
method, since at least the compression is omitted. Correspondingly, 
smaller capacities are needed for data storage and computer memory, and 
compared with the known method this makes the method according to the 
invention both simpler and cheaper. 
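
The recurrence described above can be illustrated with a minimal Python sketch. The function names, the per-sample list of speech/silence flags, and the default value of beta are assumptions made for illustration only; the silence detection itself is taken as given.

```python
def control_signal(speech_flags, c0=1.0, beta=0.998):
    """Return the gain a_0(k) for each sample.

    speech_flags: iterable of booleans, True where speech is detected
                  (hypothetical input; the detector is not modelled here).
    c0:           constant amplification value used during speech.
    beta:         decay factor (< 1) applied per sample during silence;
                  0.998 is an illustrative value, not taken from the text.
    """
    gains = []
    a = c0
    for is_speech in speech_flags:
        # Speech: hold (or restore) c0. Silence: a_0(k+1) = a_0(k) * beta.
        a = c0 if is_speech else a * beta
        gains.append(a)
    return gains


def attenuate(samples, gains):
    """Two-input multiplier: scale each sample by its control value."""
    return [x * a for x, a in zip(samples, gains)]
```

With beta = 0.5 chosen only to make the numbers easy to follow, the flags speech, silence, silence, speech give the gains 1.0, 0.5, 0.25, 1.0.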

To achieve effective noise attenuation, during silence intervals the power of 
the signal to be transmitted is reduced in accordance with a time- 
exponential function, in contrast to a reduction that depends on the input 
level as in the compander method. This already achieves appreciable noise 
attenuation, and in addition a reduction of noise during a silence interval is 
clearly less stressful for the hearing since it considerably reduces the 
deafening effect that occurs after loud noise. When speech is resumed the 
ear can react more sensitively and listen more accurately. 

Advantageously, the factor β is chosen such that the continuous reduction over
time corresponds approximately to a time constant τ_1 of the
perceptiveness of the human ear. This means that after a powerful sound
stimulus has ended, the human ear does not perceive new sound stimuli
whose timing and amplitude lie below a
decay curve that falls off with time constant τ_1. A variant of the
method according to the invention is therefore preferred, in which the factor
β is determined from the sampling rate f_T, a time constant τ_1 and a
predefined constant factor c_1 according to the relation

β = c_1 · exp(-1/(f_T · τ_1)).

For the human ear, the time constant τ_1 is chosen to be between 50 ms and 150 ms,
preferably τ_1 ≈ 65 ms.
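
As a worked illustration of this relation, the following Python sketch computes the decay factor from the sampling rate and the time constant. The function name and the sampling rate of 8000 Hz used in the note below are assumptions; the text does not fix a sampling rate.

```python
import math


def decay_factor(f_T, tau_1=0.065, c_1=1.0):
    """Per-sample decay factor beta = c_1 * exp(-1 / (f_T * tau_1)).

    f_T:   sampling rate in Hz.
    tau_1: time constant in seconds (65 ms is the preferred value in the text).
    c_1:   predefined constant factor.
    """
    return c_1 * math.exp(-1.0 / (f_T * tau_1))
```

At an assumed sampling rate of 8000 Hz, one time constant of 65 ms spans 520 samples, so with c_1 = 1 the gain after 520 silent samples is beta raised to the power 520, which equals exp(-1), about 0.37 of its starting value.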

To dimension the factor β accurately in accordance with the time constant
τ_1, it is best to choose c_0 = 1.

If the continuous exponential attenuation of the distortion signal according
to the aforesaid recurrence formula is not limited, the value of a_0(k) will very
rapidly become fairly small as k increases, approaching zero. This,
however, is not always desired, since in many cases people like to hear a
low level of residual noise, so that during a speech pause the impression
that the TK line has suddenly "gone dead" or been interrupted is avoided.
It is therefore preferable to have a variant of the method according to the
invention in which during a silence interval and/or in the presence of an
echo signal a_0(k+1) assumes a predefined constant value c_2 if the
preceding value a_0(k) has become less than or equal to c_2.
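
The limited decay can be sketched in Python as a single update step. The function name and the numeric values in the note below are illustrative assumptions.

```python
def next_gain(a_k, beta, c2):
    """One silence-interval update of the control signal.

    If the preceding value a_0(k) is already at or below the floor c2,
    the next value is c2; otherwise the exponential decay continues.
    """
    if a_k <= c2:
        return c2
    return a_k * beta
```

Starting from 1.0 with beta = 0.5 and c2 = 0.1 (values chosen only for readability), repeated updates give 0.5, 0.25, 0.125, 0.0625, 0.1, 0.1. Read literally, the rule lets the gain dip below c2 for one sample before it settles at c2; an implementation that clamps with max(a_k * beta, c2) would avoid that dip, but that is a design choice not stated in the text.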

Further, it is desirable to adapt the degree of signal level reduction during 
silence intervals to the momentary situation in the TK channel. 

For example, noise can preferably be reduced as a function of the
momentary noise level N or in a way that depends on a function g(S/N) of
the signal-to-noise ratio S/N, but short-time echoes can be reduced
more strongly and, after the end of the echo, the reduction can be restored
to the lesser value used for noise reduction.

It is therefore particularly preferable to apply a method variant
characterised in that during a silence interval and/or in the presence of an
echo signal and for a_0(k) < c_2, where c_2 is a predefined constant, the power
value of the noise level N in the communications channel currently being
used is continuously measured and/or estimated, and that depending on
the current noise level N, the control signal a_0(k+1) is continuously adjusted
according to a_0(k+1) = f(N), where f(N) is a predetermined function of N.

In this way the degree of noise attenuation is automatically controlled as a 
function of the power N of the noise actually occurring and adapted to the 
momentary noise value in the telephone channel, being followed in a 
predetermined and defined way. Via the choice of the function f(N) the
subjective impression of the overall signal produced can also be adapted.
Another advantage of this method variant is that in the case of a bundle of 
telephone channels, for example between international communication 
stations, the noise situation in each individual channel, which may very well 
be quite different from one channel to the next, can be automatically 
adjusted and optimised individually. 

Particularly preferred is a variant of the method according to the invention 
characterised in that the predetermined function f(N) is a function g(S/N), 
which depends on the quotient S/N of the power value of the signal level S 
of the useful signals to be transmitted and the power value of the noise 
level N, or that the predetermined function f(N) is a function g'(N/S), which 
depends on the reciprocal of said quotient. For reasons of simpler practical
realisation, a function of (S + N)/N or (S + N)/S can also be used.

The advantage of the above method variant is that if the useful signal level
S in the telephone channels of a bundle is varying markedly, the correct
adjustment for noise reduction will always be found. If the noise
attenuation is controlled proportionally to the reciprocal N/S, the function
g'(N/S) can easily be implemented on a digital signal processor (DSP)
with fixed computer word lengths, for example of 16 bits, using particularly
simple software, since for N/S a numerical range 0 < N/S < 1 is mainly
relevant or of interest for controlling the noise reduction.
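
To illustrate why the range 0 < N/S < 1 suits a 16-bit fixed-point processor, the following Python sketch stores the ratio in the common Q15 format (15 fractional bits). The choice of Q15, the function names, and the saturation behaviour are assumptions; the text only states the word length.

```python
Q15_MAX = 32767  # largest value of a signed 16-bit word, just below 1.0 in Q15


def q15_ratio(noise_power, signal_power):
    """Return N/S as a Q15 integer, saturating when N >= S.

    Both powers are non-negative integers, as they would be on a
    fixed-point processor.
    """
    if signal_power <= 0:
        return Q15_MAX  # no useful signal: treat the ratio as saturated
    q = (noise_power << 15) // signal_power
    return min(q, Q15_MAX)


def q15_to_float(q):
    """Convert a Q15 integer back to a floating-point fraction."""
    return q / 32768.0
```

For example, N = 1 and S = 2 give the Q15 value 16384, which converts back to 0.5. Ratios of 1 or more saturate at 32767, which matches the remark that only the range below 1 is of interest.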

Acoustic listening tests have shown that with S/N = 0 dB speech is clearly
so distorted that the noise may only be reduced to a limited extent, by a
value f_0 or g_0 between 5 and 10 dB, preferably between 6 and 8 dB, if
degradation of the overall acoustic impression in relation to
natural-sounding speech is to be avoided. At even less favourable values of the
signal-to-noise ratio, S/N < 0 dB, the value f_0 or g_0 can be retained since
any further noise reduction only worsens the overall impression.

According to these investigations, at mean S/N values the noise reduction
can be more pronounced, with a maximum in the S/N range of 10 to
15 dB. The value of the noise attenuation f_max or g_max should amount at the
maximum to between 20 and 30 dB, preferably about 25 dB.

With very good noise values such that S/N > 40 dB, only a minimal 
reduction between 0 and 3 dB should be effected so that the naturalness of 
the speech transmitted is kept as good as possible. 

The sound of the speech and its understandability are particularly good
when the function f(N) or g(S/N) runs continuously across
the three ranges discussed above, whereby rapid changes in N or in S/N
can be smoothed by filtering.

This is relatively simple to realise in terms of hardware and/or software
when the functions f(N) or g(S/N) or g'(N/S) are approximated by straight
characteristic line sections between the three aforesaid operating points
(sectional linear approximation).
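
The sectional linear approximation can be sketched in Python as follows. The operating points used as defaults (6 dB at S/N of 0 dB and below, 25 dB at 12 dB, 0 dB at 40 dB and above) are picked from within the ranges stated above and are assumptions, as are the function names and the conversion from decibels of attenuation to a linear amplitude gain.

```python
def noise_reduction_db(snr_db, g0=6.0, g_max=25.0,
                       snr_peak=12.0, snr_clean=40.0, g_clean=0.0):
    """Piecewise-linear noise reduction g(S/N) in dB.

    Below 0 dB the reduction stays at g0; it rises linearly to g_max at
    snr_peak, then falls linearly to g_clean at snr_clean and stays there.
    """
    if snr_db <= 0.0:
        return g0
    if snr_db <= snr_peak:
        return g0 + (g_max - g0) * snr_db / snr_peak
    if snr_db <= snr_clean:
        return g_max + (g_clean - g_max) * (snr_db - snr_peak) / (snr_clean - snr_peak)
    return g_clean


def gain_from_db(reduction_db):
    """Linear amplitude gain for a given attenuation in dB (assumed mapping)."""
    return 10.0 ** (-reduction_db / 20.0)
```

With these assumed operating points, an S/N of 6 dB lies halfway up the rising section and gives a reduction of 15.5 dB, while an S/N of 26 dB lies halfway down the falling section and gives 12.5 dB.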

In a somewhat more elaborate variant of the method according to the 
invention, but one whose result is a better sound picture, a polynomial 
function is used to implement the continuous functions f(N) or g(S/N) or 
g'(N/S) in the three ranges discussed, which as a result leads to a type of 
skewed bell function. 



Especially preferable is a variant of the method according to the invention 
in that the functions f(N) and g(S/N) or g'(N/S) are chosen such that the 
reduction of the noise level N is aurally compensated in accordance with 
the psychoacoustic mean value of the spectrum audible by the human ear. 
In this, the value for S and/or N is determined not solely from the 
momentary power, but also from a weighted spectral variation of S or N 
respectively, and overall via the function so obtained a noise reduction 
appropriate for audition, i.e. one which sounds psycho-acoustically 
pleasant, is achieved. Since there is no simple measure for a noise 
reduction that sounds acoustically pleasant, all the quality assessments in 
extensive listening tests are taken into account and subsequently evaluated 
by statistical methods optimised for the purpose, in order to obtain an 
evaluation scale (similarly to the case of speech codecs). 

Good noise level estimation necessitates a good silence interval detector, 
since only then can one be sure that in the silence intervals only distorting 
noise is present without any mixing at all between noise and snatches of 
speech, as is often the case in practice.

For that reason a method variant is especially to be preferred which is
characterised in that in a silence detector (SPD), a short-time output signal
sam(x), a medium-time output signal mam(x), and a long-time output
signal lam(x) are formed by means of a short-time level estimator, a
medium-time level estimator, and a long-time level estimator, respectively;
that the three output signals sam(x), mam(x), and lam(x) are so adjusted via
suitable amplification coefficients that they are approximately equal in
magnitude when the input signal x is a pure noise signal, with sam(x)
< mam(x) < lam(x); that the three output signals sam(x), mam(x), and lam(x)
are monitored by comparators; and that the presence of a speech signal as
the input signal x is assumed when both sam(x) and mam(x) first become
larger than lam(x), while the presence of a silence interval is assumed when
thereafter sam(x) and/or mam(x) become smaller than lam(x).

With the help of this relatively simple type of formation of various mean 
values of the time signal, surprisingly good silence interval detection can 
already be achieved, which requires only very little computational effort. 
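
A minimal Python sketch of such a detector is given below. The text specifies only three level estimators, amplification coefficients, and comparators; the use of first-order recursive averages of the magnitude |x|, the smoothing constants, the coefficient values, and the initialisation from the first sample are all assumptions made for illustration.

```python
class SilenceDetector:
    """Three-estimator speech/silence detector (illustrative sketch)."""

    def __init__(self, alpha_s=0.5, alpha_m=0.1, alpha_l=0.001,
                 gain_s=0.8, gain_m=0.9, gain_l=1.0):
        # Smoothing constants: larger alpha means a shorter averaging time.
        self.alphas = (alpha_s, alpha_m, alpha_l)
        # Amplification coefficients chosen so that sam < mam < lam on pure noise.
        self.gains = (gain_s, gain_m, gain_l)
        self.levels = None   # short, medium, long level estimates
        self.speech = False

    def update(self, x):
        """Process one sample and return True while speech is assumed."""
        mag = abs(x)
        if self.levels is None:
            # Assumed start-up rule: begin all estimators at the first magnitude.
            self.levels = [mag, mag, mag]
        else:
            self.levels = [lvl + a * (mag - lvl)
                           for lvl, a in zip(self.levels, self.alphas)]
        sam, mam, lam = (g * lvl for g, lvl in zip(self.gains, self.levels))
        if not self.speech and sam > lam and mam > lam:
            self.speech = True    # both short and medium exceed long: speech
        elif self.speech and (sam < lam or mam < lam):
            self.speech = False   # either falls below long: silence interval
        return self.speech
```

Fed with a steady low-level input, a loud burst, and then the low level again, this sketch reports silence, switches to speech at the start of the burst, and returns to silence a few samples after the burst ends.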

A further development of this method variant provides that for silence
interval estimation, the three output signals sam(x), mam(x), and lam(x) are
fed to a neural network which was trained with a plurality of scenarios with
different input signals x. A neural network can advantageously model
linear and non-linear relationships between a large number of input
parameters and the desired output values. A prerequisite for this is that the
neural network has first been trained with a sufficient quantity of input
values and associated output values. Thus, neural networks are
particularly well suited for the task of silence interval detection in the
presence of various kinds of distorting noise.

Preferably, besides the recognition and reduction of noise signals, the
presence of echo signals will also be detected and/or predicted and the
corresponding echo signals suppressed or attenuated. When echoes occur in a
telephone channel in addition to noise, these can as a rule be
predicted by virtue of a previously determined signal persistence time τ_E of
an echo and the previously determined echo coupling ERL in the channel
and the signal strength ES that triggers the echo in the return channel. This
estimation can be carried out in such a way that the size of the delayed echo
is estimated as a function of the speech signal emitted and its momentary
power. If the echo signal estimated in each case exceeds a
predetermined threshold value thrs within determined short time segments,
this echo-affected signal is preferably additionally damped for a short time,
for example by means of the above-mentioned exponential attenuation, to
a value necessary for a substantial reduction of the echo signal. In the same
sense, when echoes are present a compander characteristic curve can for a
short time be displaced in the direction of greater input loudness and, once
the echo has died away, it can be moved back to its original position.
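
The echo prediction described above can be sketched in Python as follows. The sketch assumes that ES is available as a per-sample power value, that the delay is known as a whole number of samples, and that ERL is expressed in dB and applied as a power ratio; the function name and these conventions are not taken from the text.

```python
from collections import deque


def echo_flags(far_end_power, delay_samples, erl_db, thrs):
    """Flag the samples at which the predicted echo exceeds the threshold.

    far_end_power: per-sample power ES of the signal that triggers the echo.
    delay_samples: echo delay expressed in samples (assumed known).
    erl_db:        echo coupling loss in dB (assumed to act on power).
    thrs:          threshold above which additional damping is applied.
    """
    coupling = 10.0 ** (-erl_db / 10.0)
    # Delay line holding the last delay_samples power values, oldest first.
    history = deque([0.0] * delay_samples, maxlen=delay_samples)
    flags = []
    for p in far_end_power:
        delayed = history[0] if delay_samples > 0 else p
        flags.append(delayed * coupling > thrs)
        if delay_samples > 0:
            history.append(p)  # the oldest entry drops out automatically
    return flags
```

With a single power pulse of 1.0 at the first sample, a delay of two samples, an ERL of 10 dB, and a threshold of 0.05, only the third sample is flagged, since the predicted echo power there is 0.1.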

Especially preferred is a further development of this method variant in that
the control signal a_0(k+1) is continuously adjusted according to a_0(k+1) =
h(N, S, ES, τ_E, ERL), where h(N, S, ES, τ_E, ERL) is a predetermined function of
the noise level N, the signal level S, the useful signal ES in the opposite
direction from a speaking party, the constant delay τ_E of the echo signal,
and an attenuation constant ERL of the amplitude of the echo signal.

Advantageously, a noise reduction appropriate for audition can be 
combined with an echo reduction independent of it. This is particularly 
important when there is virtually no background noise in the telephone 
channel, since there is then no noise attenuation and echo signals that 
occur can therefore reach the caller unimpeded. 

Separation of the control of noise reduction from that of echo attenuation is
appropriate, since noise and echoes occur independently of one another
and are also typically caused by completely different physical effects.
However, a general reduction function R can be generated mathematically,
which describes an attenuation of signal levels for both noise and echoes:

R(S, N, ES, τ_E, ERL, thrs) = g(S/N) · d(ES, τ_E, ERL, thrs)

in which g(S/N) is the noise reduction described earlier and d(...) denotes 
the independent additionally occurring echo attenuation when the 
estimated echo signal exceeds the predetermined threshold value thrs. 
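
A minimal Python sketch of this product form follows. It assumes that both terms are expressed as linear gain factors and that the echo term is simply 1 when the estimated echo stays at or below the threshold; the function and parameter names are hypothetical.

```python
def reduction_factor(g_noise, d_echo, echo_estimate, thrs):
    """Combined reduction R = g * d as a linear gain factor.

    g_noise:       gain factor from the noise reduction g(S/N).
    d_echo:        additional gain factor applied while an echo is predicted.
    echo_estimate: estimated echo level for the current segment.
    thrs:          threshold above which the echo term becomes active.
    """
    echo_term = d_echo if echo_estimate > thrs else 1.0
    return g_noise * echo_term
```

For instance, with a noise gain of 0.5 and an echo gain of 0.25, the combined factor is 0.125 while the echo estimate exceeds the threshold and 0.5 otherwise.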

Particularly advantageous is a method variant in which during the time of 
an echo reduction, an artificial noise signal is added to the useful signal. 

At constant noise level, the noise attenuation is also constant. A suddenly
occurring additional echo reduction in the speech rhythm means that there 
will also be a noise attenuation in the speech rhythm (at least in the short 
time segment). This leads to pulsed background noise which does not 
sound natural. It is therefore advantageous, at the instants when additional 
echo reduction takes place, to add to the processed signal a synthetic noise 
from a suitable noise generator of about the same magnitude as normal 
background noise. This results in background noise for the listener which is 
as constant as possible. 

The noise generator can be designed such that the artificial noise signal 
comprises an acoustic signal sequence psycho-acoustically perceived as 
pleasant (= comfort noise). 
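
The addition of artificial noise during echo reduction can be sketched in Python as follows. Gaussian noise from the standard library stands in for the noise generator, and the function name, the per-sample activity flags, and the noise level parameter are assumptions; the text does not prescribe a particular generator.

```python
import random


def add_comfort_noise(samples, echo_active, noise_rms, seed=0):
    """Add synthetic noise to the samples processed under echo reduction.

    samples:     processed output samples.
    echo_active: per-sample booleans, True while additional echo
                 reduction is being applied.
    noise_rms:   standard deviation of the added noise, to be matched
                 to the normal background noise level.
    seed:        fixed seed so the sketch is reproducible.
    """
    rng = random.Random(seed)
    out = []
    for x, active in zip(samples, echo_active):
        if active:
            x = x + rng.gauss(0.0, noise_rms)
        out.append(x)
    return out
```

Samples outside the echo-reduction segments pass through unchanged, so the listener hears a roughly constant background rather than noise that pulses with the speech rhythm.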

Instead of synthetic background noise, however, a section of previously
occurring real background noise of appropriate strength can be introduced
during the echo-time segments. The added noise is then virtually no
different from the previous noise and therefore results in no distorting
acoustical variation for the listener.

The addition of noise for the acoustic masking of these effects and the
measures for separate treatment of noise and echoes, when these are correctly
matched to one another, result in a particularly understandable and
pleasant speech impression even in "difficult" environments (echoes plus
noise).

Particularly preferable is also a variant of the method according to the 
invention, in which the useful signal to be transmitted is subjected to a 
spectral subtraction. The advantage of spectral subtraction with subsequent 
level attenuation during the speech pauses is that first, by spectral 
subtraction, part of the distorting noise is eliminated from the speech signal 
itself, and only after this are the speech pauses freed from noise and 
echoes in the manner described. Overall, in subjective tests this 
combination gives better listening impressions than simple spectral 
subtraction alone. 

Finally, a further particularly advantageous variant of the method according 
to the invention provides that the useful signal to be transmitted is subjected 
to spectral filtering adapted to the sense of human hearing. Here too, with 
the means of spectral subtraction an estimate of noise, speech and echoes 
is first carried out, a masking threshold appropriate for audition is then 
determined, and the whole signal is then processed via an appropriately 
adjusted transmission filter such that the speech fraction is as undistorted as 
possible and the echo and noise fractions are suppressed to as large an 
extent as possible.

A combination with the subsequent level attenuation during silence intervals
improves the listening impression still further.

The scope of the present invention also includes a server unit to support the 
method according to the invention described above, and a computer 
program for implementing the method. The method can be realised both 
as hardware circuit and in the form of a computer program. Nowadays 
software programming for a powerful DSP is preferred, because new 
knowledge and additional functions can be implemented more easily by 
modifying the software on an existing hardware basis. However, processes 
can also be implemented as hardware modules, for example in TK 
terminals or telephones* 

Further advantages of the invention emerge from the description and 
figures. Likewise, the characteristics mentioned earlier and any indicated in 
what follows can in each case be applied individually as such, or several 
together in any combinations. The embodiments indicated and described 
are not to be understood as exclusive, but rather, as examples which 
illustrate the invention. 

The invention is illustrated in the figures and will be described in more 
detail with reference to example embodiments. The figures show: 

Fig. 1: The control signal a_0 in the presence of speech signals, during
a silence interval, and when the speech signal resumes

Fig. 2: Scheme of an arrangement for controlled signal attenuation

Fig. 3a: The function g(S/N) in linear approximation

Fig. 3b: The corresponding function g'(N/S)

Fig. 4a: The function g(S/N) as a skewed bell curve, and

Fig. 4b: The corresponding function g'(N/S).

The control signal a_0 shown in Fig. 1 as a function of time t and sample
number k is kept at a value c_0 = 1 during a first phase T1 in which speech
signals are detected. During a silence interval in the time segment T2, the
control signal a_0 is reduced to a constant value c_2 slightly above 0, and
then, when the speech signal resumes during a phase T3, it is sharply
increased again to the value c_0 = 1 (or to some other, freely selectable
constant). Consequently, during the speech phases T1 and T3 there is no (or,
in other examples, only a slight) suppression of interfering signals in the
overall signal, so that the speech signal is transmitted as unmodified and
as unimpeded as possible. During the silence interval in phase T2, echoes
and noise signals are suppressed as effectively and as quickly as possible
(exponentially), although in the present example they are attenuated not to
0 but to a small residual value c_2, to avoid creating the impression of a
"dead" line at the other end. When echoes occur, attenuation takes place
down to a residual value c_3 < c_2.
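
The behaviour of the control signal in phases T1 to T3 can be illustrated
with a minimal Python sketch. It is not the claimed implementation: the
function names, the 8 kHz sampling rate, the value c_2 = 0.05 and the
per-sample speech flags are assumptions made for illustration, and the
decay factor follows the relation of claim 2 as read here,
beta = c_1 * exp(-1/(f_T * tau_1)).

```python
import math


def beta_from_time_constant(f_t, tau_1, c_1=1.0):
    """Per-sample decay factor beta = c_1 * exp(-1 / (f_t * tau_1)).

    f_t is the sampling rate in Hz and tau_1 the time constant in seconds.
    c_1 = 1.0 is an assumed default, not a value taken from the text.
    """
    return c_1 * math.exp(-1.0 / (f_t * tau_1))


def control_signal(speech_flags, beta, c_0=1.0, c_2=0.05):
    """Return a_0(k) for every sample k.

    speech_flags[k] is True while speech is detected (phases T1, T3) and
    False during a silence interval (phase T2). c_2 = 0.05 is an assumed
    residual value "slightly above 0".
    """
    a = c_0
    out = []
    for speech in speech_flags:
        if speech:
            a = c_0              # speech present: no attenuation
        else:
            a = a * beta         # silence: a_0(k+1) = a_0(k) * beta
            if a <= c_2:
                a = c_2          # hold the residual value, never reach 0
        out.append(a)
    return out


if __name__ == "__main__":
    beta = beta_from_time_constant(f_t=8000.0, tau_1=0.065)
    flags = [True] * 3 + [False] * 5000 + [True] * 2
    a_0 = control_signal(flags, beta)
    print(beta, a_0[3], a_0[5002], a_0[-1])
```

Under these assumed values beta is about 0.9981, so the control signal
needs roughly 1560 samples (about 195 ms) of silence to fall from 1 to the
residual value 0.05, and it jumps back to c_0 on the first sample flagged
as speech.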

Fig. 2 illustrates schematically the functional mode of an arrangement for 
noise and echo reduction with a silence interval detector, corresponding to 
the above-mentioned reduction function R(S, N, ES, τ_E, ERL, fhrs).

For all the curves shown in Figs. 3a to 4b, the function value g or g' for
the case in which S/N < 0 dB, i.e. when the noise background is extremely
high, changes to a constant value g_0 of the noise reduction equal to
approximately 6 dB. Starting from S/N = 0 dB, as the signal-to-noise ratio
S/N improves progressively, the noise reduction increases up to a
maximum g_max ≈ 25 dB at approximately S/N = 12 dB. If S/N increases
further, the degree of noise reduction finally falls towards zero, so that
when little background noise is present, the transmitted useful signal is
manipulated as little as possible.
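
The linear approximation of g(S/N) described above can be sketched as a
piecewise-linear function. The values g_0 = 6 dB, g_max = 25 dB and the
peak at S/N = 12 dB are taken from the description; the S/N value at which
the reduction reaches zero (40 dB here) and the function name are
assumptions, since the text only states that the curve "falls towards
zero".

```python
def noise_reduction_db(snr_db, g_0=6.0, g_max=25.0,
                       snr_peak=12.0, snr_zero=40.0):
    """Piecewise-linear noise reduction g(S/N) in dB.

    snr_zero (the S/N at which g reaches 0 dB) is an assumed value.
    """
    if snr_db <= 0.0:
        # Very noisy channel: constant reduction g_0.
        return g_0
    if snr_db <= snr_peak:
        # Rising section from g_0 at 0 dB up to g_max at snr_peak.
        return g_0 + (g_max - g_0) * snr_db / snr_peak
    if snr_db < snr_zero:
        # Falling section from g_max down to 0 dB at snr_zero.
        return g_max * (snr_zero - snr_db) / (snr_zero - snr_peak)
    # Little background noise: leave the useful signal untouched.
    return 0.0


if __name__ == "__main__":
    for snr in (-5.0, 0.0, 6.0, 12.0, 26.0, 40.0):
        print(snr, noise_reduction_db(snr))
```

A skewed bell curve as in Fig. 4a would replace the two linear sections
with polynomial sections through the same three support points.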

Patent Claims



1. A method of reducing echo and/or noise signals in telecommunications
systems for transmitting useful acoustic signals, particularly human
speech, comprising determining by silence detection when the mixture of
useful signals and interference signals contains a speech signal or when
a silence interval is present, and varying, by means of a two-input
multiplier, the amplitude of the useful signals, which are generally
disturbed by echo and/or noise signals, in response to a time-dependent
control signal a_0(t) or a control signal a_0(k) clocked at a sampling rate
f_T = 1/T, where k ∈ ℕ denotes the sample number, and T denotes the
period from one sample to the next,

characterized in

that the control signal a_0(t) or a_0(k) is varied in such a way that in the
presence of speech signals in the useful signal, the amplitude of the
control signal a_0(t) or a_0(k) is set to a predetermined constant value c_0,
that from the beginning of a silence interval in the useful signal, the
amplitude of the control signal a_0(t) or a_0(k) is continuously reduced from
one sample to the next according to the recursion formula

a_0(k+1) = a_0(k) · β, where β < 1,

and that after the end of a silence interval, a_0(k) is set equal to c_0.

2. A method as claimed in claim 1, characterized in that the factor β is
determined from the sampling rate f_T, a time constant τ_1, and a
predefined constant factor c_1 according to the relation

β = c_1 · exp(-1/(f_T · τ_1)).

3. A method as claimed in claim 2, characterized in that the time constant
τ_1 is chosen to be between 50 ms and 150 ms, preferably ≈ 65 ms.

4. A method as claimed in any one of the preceding claims, characterized
in that the constant value c_0 is chosen to be equal to 1.

5. A method as claimed in any one of the preceding claims, characterized
in that during a silence interval and/or in the presence of an echo signal
a_0(k+1) assumes a predefined constant value c_2 if the preceding value
a_0(k) has become less than or equal to c_2.

6. A method as claimed in any one of claims 1 to 4, characterized in that
during a silence interval and/or in the presence of an echo signal and
for a_0(k) < c_2, where c_2 is a predefined constant, the power value of the
noise level N in the communications channel currently being used is
continuously measured and/or estimated, and that depending on the
current noise level N, the control signal a_0(k+1) is continuously adjusted
according to a_0(k+1) = f(N), where f(N) is a predetermined function of
N.

7. A method as claimed in claim 6, characterized in that the predetermined
function f(N) is a function g(S/N), which depends on the quotient S/N of
the power value of the signal level S of the useful signals to be
transmitted and the power value of the noise level N, or that the
predetermined function f(N) is a function g'(N/S), which depends on the
reciprocal of said quotient.

8. A method as claimed in claim 7, characterized in that, if
1/N << 1 or S/N = 0 dB, the function f(N) or g(S/N) begins with a
constant value f_0 > 0 or g_0 > 0, respectively, rises to a maximum f_max or
g_max in the range between N or S/N = 10 dB to 15 dB, respectively,
preferably at N or S/N ≈ 12 dB, respectively, and then decreases to a
minimum value f_min or g_min, respectively, preferably to 0 dB, where
5 dB < f_0, g_0 < 10 dB, preferably 6 dB < f_0, g_0 < 8 dB, and where
20 dB < f_max, g_max < 30 dB, preferably f_max, g_max ≈ 25 dB.

9. A method as claimed in any one of claims 6 to 8, characterized in that 
the function f(N) or g(S/N) is linear at least in sections, preferably in all 
its sections. 

10. A method as claimed in any one of claims 6 to 8, characterized in that 
the function f(N) or g(S/N) consists of polynomials and is a skewed bell- 
shaped curve. 

11. A method as claimed in any one of claims 6 to 10, characterized in that 
the functions f(N) and g(S/N) or g'(N/S) are chosen such that the 
reduction of the noise level N is aurally compensated in accordance with 
the psychoacoustic mean value of the spectrum audible by the human 
ear. 

12. A method as claimed in any one of the preceding claims, characterized 
in that in addition to the detection and reduction of noise signals, the 
presence of echo signals is detected and/or predicted, and that the echo 
signals are suppressed or reduced. 

13. A method as claimed in claim 12 and in any one of claims 6 to 11,
characterized in that the control signal a_0(k+1) is continuously adjusted
according to a_0(k+1) = h(N, S, ES, τ_E, ERL), where h(N, S, ES, τ_E, ERL) is
a predetermined function of the noise level N, the signal level S, the
useful signal ES in the opposite direction from a speaking party, the
constant delay τ_E of the echo signal, and an attenuation constant ERL of
the amplitude of the echo signal.

14. A method as claimed in claim 12, characterized in that the reduction of 
noise signals and the reduction of echo signals are controlled separately. 

15. A method as claimed in any one of claims 12 to 14, characterized in that 
during the time of an echo reduction, an artificial noise signal is added 
to the useful signal. 

16. A method as claimed in claim 15, characterized in that the artificial noise
signal comprises an acoustic signal sequence perceived to be
psychoacoustically pleasant (= comfort noise).

17. A method as claimed in claim 15, characterized in that the artificial noise
signal comprises a noise signal previously recorded during the current 
communication. 

18. A method as claimed in any one of the preceding claims, characterized 
in 

that in a silence detector (SPD), a short-time output signal sam(x), a
medium-time output signal mam(x), and a long-time output signal lam(x) 
are formed by means of a short-time level estimator, a medium-time 
level estimator, and a long-time level estimator, respectively, 
that the three output signals sam(x), mam(x), and lam(x) are so adjusted 
via suitable amplification coefficients that they are approximately equal 
in magnitude when the input signal x is a pure noise signal, with 
sam(x) < mam(x) < lam(x), 

that the three output signals sam(x), mam(x), and lam(x) are monitored 
by comparators, and 

that the presence of a speech signal as the input signal x is assumed 
when both sam(x) and mam(x) first become larger than lam(x), while the 
presence of a silence interval is assumed when thereafter sam(x) and/or 
mam(x) become smaller than lam(x). 

19. A method as claimed in claim 18, characterized in that for silence
interval estimation, the three output signals sam(x), mam(x), and lam(x) 
are fed to a neural network which was trained with a plurality of 
scenarios with different input signals x. 

20. A method as claimed in any one of the preceding claims, characterized 
in that the useful signal to be transmitted is subjected to a spectral 
subtraction. 

21. A method as claimed in any one of the preceding claims, characterized
in that the useful signal to be transmitted is subjected to spectral filtering 
adapted to the sense of human hearing. 

22. A server unit for supporting the method claimed in any one of claims
1 to 21.

23. A computer program for carrying out the method claimed in any one of
claims 1 to 21.
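
The comparator logic of the silence detector in claim 18 can be sketched
as follows. This is an illustration under stated assumptions, not the
claimed circuit: the level estimators are modelled as first-order
recursive averages of |x|, and the smoothing factors, the amplification
coefficients, the initialisation with the first sample and the function
name are all hypothetical choices.

```python
def detect_speech(x, alphas=(0.2, 0.02, 0.0005), gains=(0.5, 0.7, 1.0)):
    """Return one speech/silence flag per input sample.

    alphas: assumed smoothing factors of the short-, medium- and long-time
            level estimators (larger value = faster estimator).
    gains:  assumed amplification coefficients, chosen so that
            sam(x) < mam(x) < lam(x) holds for a pure noise input.
    """
    if not x:
        return []
    # Assumed initialisation: all three estimators start at |x[0]|.
    s = m = l = abs(x[0])
    speech = False
    flags = []
    for sample in x:
        mag = abs(sample)
        s += alphas[0] * (mag - s)   # short-time level estimate
        m += alphas[1] * (mag - m)   # medium-time level estimate
        l += alphas[2] * (mag - l)   # long-time level estimate
        sam, mam, lam = gains[0] * s, gains[1] * m, gains[2] * l
        if not speech and sam > lam and mam > lam:
            speech = True            # both fast estimators exceed lam(x)
        elif speech and (sam < lam or mam < lam):
            speech = False           # sam(x) and/or mam(x) fell below lam(x)
        flags.append(speech)
    return flags


if __name__ == "__main__":
    noise = [0.01 * (-1) ** i for i in range(500)]
    burst = [0.5 * (-1) ** i for i in range(400)]
    flags = detect_speech(noise + burst + noise)
    print(flags[499], flags[500], flags[899], flags[-1])
```

The resulting flags could drive a control signal a_0(k) of the kind
defined in claim 1. With a constant-amplitude input the long-time
estimator eventually catches up with the others, so the sketch is only
meaningful for short bursts or for signals whose level fluctuates as
speech does.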

Summary

Method for the reduction of echo and/or noise signals in
telecommunications systems for the transmission of useful acoustic signals,
in which it is determined by means of silence interval detection when a
silence interval is present, and the disturbed useful signal is then
modified by a time-dependent control signal a_0(t) or by a control signal
a_0(k) clocked at a sampling rate f_T = 1/T. The method is characterised in
that the control signal a_0(k) is varied in such a manner that, while speech
signals are present in the useful signal, the amplitude of the control
signal a_0(k) is set to a predetermined constant value c_0, and when a
silence interval begins, the amplitude of the control signal a_0(k) is
reduced continuously from one sample value to the next in accordance with
the recursion formula a_0(k+1) = a_0(k) · β with β < 1. After the end of the
silence interval, a_0(k) is again set equal to c_0. In this way, echo and
noise attenuation can be effected simply, cost-effectively, without great
computational effort, and with modest need for computer memory and data
storage space. With simple means, the echo and noise reduction produces an
overall acoustic impression that is as pleasant as possible for the human
ear and can be adapted to individual preferences.



(Fig. 1).